System and method for simplifying and manipulating k-partite graphs

ABSTRACT

The system has a collection of a plurality of objects. Each object defines a node in a k-partite graph, such that the nodes can be divided into a number of mutually exclusive sets, all of the nodes are in exactly one of the sets, and edges occur only between nodes in different sets. The system also has a simplification process that aggregates one or more of the nodes into one or more categories and identifies a category node corresponding to each category. The category node inherits the mode and the edges of all the nodes in the respective category. Further, the system contains Directed Acyclic Graph Indices (DAGIs) whose nodes may have a 1-1 mapping with the nodes in the k-partite graph. These indices can be used to aggregate and hide nodes in the k-partite graph. Aggregation occurs by selecting one or more non-leaf nodes in the DAGI and aggregating all descendant nodes. Hiding occurs by selecting some set of DAGI nodes, thus selecting some corresponding set of nodes in the k-partite graph, and requesting that this set of nodes be hidden, which effectively removes them from further consideration until they are restored by explicit request.

FIELD OF THE INVENTION

This invention relates to the field of information retrieval and analysis. More specifically, the invention relates to visualization of retrieved and/or analyzed information over a network.

BACKGROUND OF THE INVENTION

Graphs provide a powerful formalism and visual representation for modeling objects and their relationships. Informally, a graph is simply a collection of vertices or nodes, pairs of which are connected by edges. More formally, a graph is a set of vertices with an adjacency relation between vertices. The edges may be undirected (i.e., symmetric) or directed (i.e., asymmetric). In addition, weights may be attached to the nodes, in which case the graph is called a network.

One type of graph is a k-partite graph. A k-partite graph is a graph whose vertices can be partitioned into k disjoint sets so that no two vertices within the same set are adjacent. See Deo, N., 1974, entitled Graph theory with applications to engineering and computer science, P. 168-169, Prentice-Hall, Inc: Englewood Cliffs, N.J. A special case occurs when k=2; this type of graph is called a bipartite graph. In a bipartite graph there are two sets and each node is a member of one set. Further, all of its connections are to nodes in the other set.
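The definition above can be checked mechanically. The following is a minimal sketch, assuming the graph is given as a list of edges and a mapping from each node to the name of its set; the function name `is_k_partite` and the sample data are illustrative and do not come from the referenced text.

```python
def is_k_partite(edges, set_of):
    """Return True if no edge joins two nodes that belong to the same set.

    edges  -- iterable of (node, node) pairs
    set_of -- dict mapping each node to the label of the set it belongs to
    """
    return all(set_of[u] != set_of[v] for u, v in edges)


# Hypothetical bipartite example: people and the events they attend.
set_of = {"alice": "people", "bob": "people", "party": "events"}
attendance = [("alice", "party"), ("bob", "party")]
k = len(set(set_of.values()))  # number of sets, here 2
```

Here `is_k_partite(attendance, set_of)` holds, while adding an edge between "alice" and "bob" would violate the definition because both are in the same set.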

Bipartite graphs are important in social network analysis, where they are called two-mode networks, affiliation networks, or actor networks. See Borgatti, A. & Everett, M. G., 1997, entitled Network analysis of 2-mode data, available online at http://www.analytictech.com/borgatti/2mode.htm. Commonly an affiliation network is used to see the relationships between a group of people via a set of events in which they participate. When modeling or graphing these relationships, one set of nodes, or mode, is the people. The other set of nodes, or mode, is the events. Whenever a person participates in an event there is an edge connecting the two. The affiliation networks express the social relationships of the people involved, so that properties of the people and the events can be derived from them. Examples include which event was attended by the most people, which person went to the most events, and which people and events are most central, i.e., do the best job of tying together the group.

A key problem in analyzing social networks is acquiring and storing the data. Often this data is accumulated manually by having people fill out surveys summarizing their participation in events; these data are then tabulated. Alternatively, traces of people's social behavior can be gleaned from computer records. For example, a system called Netscan, referenced in Xiong, R., Smith, M. A., and Drucker, S., dated October 1998, entitled Visualizations of Collaborative Information for End-Users, Microsoft Technical Report No. MST-TR-98-52 (also online at research.microsoft.com/~sdrucker/papers/collabvizchi99.doc), automatically scans Usenet archives and associates authors with the messages they post. This graph is a 2-mode or bipartite graph since the nodes can be divided into two sets and a node in one set only connects to nodes in the other set. These graphs are visualized to help Usenet users trace through connections between authors and their postings.

The field of graph visualization, a subfield of information visualization, seeks to provide techniques and systems to aid in the inspection, navigation, and analysis of graphs. This includes the question of how to lay out the graph so people can see the relationships between nodes, and of providing interfaces that allow these relationships to be dynamically manipulated. A general goal of viewing and interacting with graphs is providing the ability to focus in on regions of interest, while providing sufficient context or background to aid in the interpretation of the foreground or focal information. A good survey of graph visualization techniques divides its review into 1) Graph layout methods: Deciding where to place the nodes and links; 2) Navigation and interaction: How the user moves around the graph and manipulates it; and 3) Clustering: Simplifying the graph by grouping or aggregating nodes. See Herman, I., Melancon, G., & Marshall, M., 2000, entitled Graph visualization and navigation in information visualization: a survey, in IEEE Transactions on Visualization and Computer Graphics 6(1), 24-43.

An important operation to simplify graphs that is provided by many systems is filtering. Filtering a graph means removing nodes according to set criteria. For example, dynamic controls can be provided that select which nodes should be retained. See Becker, R. A., Eick, S. G., & Wilks, A. R., 1995, entitled Visualizing network data. IEEE Transactions on Visualization and Computer Graphics. 1(1). 16-28.

Other powerful simplification operators use hierarchies, either implicit in the graph (intrinsic or structural) or defined elsewhere (extrinsic). These hierarchies can simplify graphs directly (e.g., only presenting the remaining hierarchy) or by providing assistance in analyzing the graph. A strict hierarchy or tree is defined as a directed graph where every node other than the root has exactly one parent, i.e., one node that points to it. A more flexible hierarchy is a directed acyclic graph where nodes may have more than one parent, but no cycles or loops exist in the graph. One system that explored the use of hierarchies for graph visualization extensively provides facilities for (1) aggregating the graph into its bi-connected components, (2) viewing a spanning tree of the graph via TreeMaps (a space-filling version of a tree), and (3) extracting a subset of the hierarchies to show a focal node and its nearby relatives in order to provide a sense of context for the node that explains how it fits into the overall graph. See Rivlin, E., Botafogo, R. & Shneiderman, B. Navigating in hyperspace: Designing a structure-based toolbox. Communications of the ACM, 37:87-96, 1994.

A common graph visualization problem is how to label nodes in the graph. This problem is especially important when the nodes in a graph represent lengthy text objects such as word processing documents or web pages. Solving this problem is similar to finding a brief summary for a document. There are many well-known algorithms for extracting salient text units from a document collection. One approach assumes that text units with a uniform distribution over the collection of documents are not salient and should be filtered out. Another approach is to see if the frequency of a text unit in the text is high relative to its frequency in a corpus of background text. See Moens, M. F., 2000, entitled Automatic Indexing and Abstracting of Document Texts. P. 89-97. Kluwer Academic Publishers: Boston, Mass. In this technique each term, made up of one or more consecutive words, is assigned a tf*idf weight, which stands for term frequency times inverse document frequency.
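The tf*idf weighting mentioned above can be illustrated with a short sketch. The text does not fix a particular formula, so this example assumes one common variant, idf = log(N / df), where N is the number of documents and df is the number of documents containing the term. It also assumes single-word terms and pre-tokenized documents; the function name `tf_idf` is illustrative.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs -- list of documents, each a list of term strings.
    Weight = (count of term in document) * log(N / document frequency).
    """
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return weights


# Hypothetical three-document collection.
docs = [["graph", "node", "graph"], ["graph", "edge"], ["graph", "author"]]
weights = tf_idf(docs)
```

Under this variant a term that occurs in every document (here "graph") receives weight zero, which matches the intuition that uniformly distributed text units are not salient, while a term confined to one document receives a positive weight.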

These references are herein incorporated by reference in their entirety.

PROBLEMS WITH THE PRIOR ART

The creation of k-partite graphs that describe the relationships between objects, such as digital documents and people, is not well automated. Data to create these graphs is often acquired through surveys or interviews. These techniques are time-consuming and prone to error since subjects' self-reports can differ from their actual practice. Further, they are difficult to update since notification that a change has occurred is often not made, and finding out what has changed requires redoing the expensive interviewing or survey processes.

The simplification of k-partite graphs is difficult, relying either on complex querying systems that require programming or elaborate specification, or on manual effort. In the former case, there are systems that can simplify graphs based on their structure, but substantial skill and training are needed to perform these operations, such as the ability to program in a graph-oriented language. In the latter case, the simplifications are exceptionally time-consuming and laborious, requiring selecting the nodes by hand or using analysis techniques that are defined in terms of general graphs, rather than k-partite graphs.

Further, despite the availability of external directories of people and other extrinsic indices on the nodes, there are no straightforward tools for simplifying k-partite graphs using these indices. Finally, there are no straightforward tools for simplifying k-partite graphs that represent social networks, including both people and their computational artifacts, such as documents, using social network analyses.

In general, current systems fail to take account of the benefits of analyses based on the social networks that have accounted for the creation, use, modification, or other history of digital objects. For example, systems fail to characterize and label objects by analyzing relationships between their authors based on the authors' relationships to other objects, such as documents written.

The viewing and manipulation of large k-partite graphs is not well supported in current graphical user interfaces for information visualization. In particular, systems fail to provide sufficient controls for maintaining focus on a subset of the graph. Further, systems fail to allow focusing on one set of objects while relegating other objects to background status by, for example, suitably aggregating them, while still preserving relationships across the modes for the focused objects.

Systems that create nodes in graphs corresponding to documents or other objects in digital repositories fail to derive short and useful labels for these nodes. The unavailability of short and useful labels makes it difficult to get the “big picture” of the digital objects and their relationships in the graph when these objects are viewed in a graphical user interface.

OBJECTS OF THE INVENTION

An object of this invention is an improved system and method for information retrieval and processing.

An object of this invention is an improved system for information retrieval and processing that represents the relationships between digital objects as a k-partite graph.

An object of this invention is an improved system and method for information retrieval and processing that represents the relationships between digital objects as a k-partite graph and directed acyclic graph indices on those objects.

An object of this invention is an improved system and method for information retrieval and processing that represents the relationships between digital objects as a k-partite graph and directed acyclic graph indices (DAGI) on those objects, where the DAGI provides select, hide, aggregate, and categorize operations on the k-partite graph.

An object of this invention is an improved system and method for information retrieval and processing that represents the relationships between digital objects as a k-partite graph and directed acyclic graph indices (DAGI) on those objects, where the k-partite graph is shown as a tree control.

An object of this invention is an improved system and method for information retrieval and processing with automatic aggregation.

An object of this invention is an improved system and method for information retrieval and processing that provides short title descriptions for nodes that have text descriptions associated with them.

SUMMARY OF THE INVENTION

The present invention is a system and method for information processing. The system has a collection of a plurality of objects. Each object defines a node in a k-partite graph, such that each node is a member of a set, the k-partite graph having at least two nodes. Each node is in exactly one set. The nodes are connected by edges such that no node is connected to another node in the same set. The system also has a simplification process that aggregates one or more of the nodes into one or more categories and identifies a category node corresponding to each category. The category node inherits the mode and the edges of all the nodes in the respective category.

In a preferred embodiment, the system also contains Directed Acyclic Graph Indices (DAGIs) whose nodes may have a 1-1 mapping with the nodes in the k-partite graph. These indices can be used to aggregate and hide nodes in the k-partite graph according to its structure and using the 1-1 mapping.

BRIEF DESCRIPTION OF THE FIGURES

The foregoing and other objects, aspects, and advantages will be better understood from the following non-limiting detailed description of preferred embodiments of the invention with reference to the drawings that include the following:

FIG. 1 is a block diagram of one preferred embodiment of the present invention.

FIG. 2 is a series of screen shots showing the graphical user interface and key functionalities via the preferred embodiment.

FIG. 3 is a data structure that defines a preferred k-partite graph.

FIG. 4 includes FIGS. 4A and 4B, where FIG. 4A is a data structure that defines a preferred directed acyclic graph index (DAGI), and FIG. 4B is a data structure that defines a preferred DAGI tree.

FIG. 5 is a data structure that defines a non-limiting example of a k-partite graph linked to multiple DAGIs.

FIG. 6 is a flowchart that provides a high level overview of the process of analyzing a k-partite graph including one preferred embodiment using a DAGI.

FIG. 7 includes FIGS. 7A and 7B, where FIG. 7A is a flowchart describing the creation of a DAGI based on a partial order analysis of the k-partite graph, and FIG. 7B is a flowchart describing the creation of a DAGI based on removing a mode (i.e., a set of nodes in the k-partite graph) from the k-partite graph.

FIG. 8 includes FIGS. 8A and 8B, where FIG. 8A is a flowchart describing the process of aggregation within a k-partite graph, and FIG. 8B is a flowchart describing the process of aggregation within a k-partite graph and directed via a DAGI.

FIG. 9 is a flowchart describing the process of reducing a k-partite graph by removing a mode.

FIG. 10 is a flowchart describing a process of focusing the k-partite graph by automatically aggregating nodes.

FIG. 11 is a flowchart describing the algorithm for deriving short labels for nodes that have text associated with them.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 is a block diagram of the computing environment in which the present invention is used in a non-limiting preferred embodiment. The figure shows some of the possible hardware, software, and networking configurations that make up the computing environment. The computing environment or system 100 comprises one or more general purpose computers 170, 175, 180, 185, 190, and 195 interconnected by a network 105. Examples of general purpose computers include the IBM Aptiva personal computer, the IBM RISC System/6000 workstation, and the IBM parallel SP2. (These are Trademarks of the IBM Corporation.) The network 105 may be a local area network (LAN), a wide area network (WAN), or the Internet. Moreover, the computers in this environment may support the Web information exchange protocol (HTTP) and be part of a local Web or the World Wide Web (WWW). Some computers (e.g., 195) may occasionally or always be disconnected 196 from the network and operate as stand-alone computers.

Data objects 140 are any digital objects such as books, articles, reports, patents, web pages, recordings, relational data bases, object-oriented data bases, file systems, directories that contain text, images, video, audio, digital data, or any other multimedia object and/or information and/or components thereof. One or more data objects are stored on one or more computers in the environment.

To find a particular data object in the environment, a query is submitted for processing to a query processor 130 running on a computer in the environment. Query processors 130 are processes such as topical search engines, data base query processors, directory search protocols, and any sort of interface that allows the specification of search criteria as an input and responds with digital data where that data is either the requested data itself or pointers to network locations where the data can be obtained. The data objects collection 141 obtained via the query processor may be in the form of a hit-list, database relation, or other type of data aggregate. A data object collection 141 may comprise data objects located anywhere in the computing environment, e.g., spread across two or more computers. The process is well known in the prior art. Examples of query processors 130 include Search Manager/2 (a trademark of the IBM corporation) and DB2 (a trademark of IBM corporation).

The result of the search is then analyzed by the k-partite graph generator 110 to identify the desired objects and their relationships. Based on this analysis additional queries may be made and additional data objects requested, retrieved, and analyzed. Based on inferred structural relationships in the data returned, the k-partite graph generator 110 creates a k-partite graph that represents the objects and the relationships between them. For example, a document base is queried and a bi-partite graph of documents and their authors is created, where there is a node for each document and each author, and edges between documents and authors to indicate authorship. In the preferred embodiment there are three sorts of queries that can be made to such a document base. First, a set of authors can be specified in order to retrieve a graph of those authors, the documents they authored, and all co-authors. Second, a single author can be specified together with a crawl level, which is an integer greater than 1 specifying how many degrees of separation to extend the query. This retrieves a graph containing the author, her documents, the co-authors, the co-authors' documents, the co-authors of the co-authors' documents, and so on to the degree of indirection given by the crawl level. Third, a keyword query can be given that retrieves a graph containing documents that reference those keywords and the authors of those documents.
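The second kind of query, a crawl outward from a single author, can be sketched as a breadth-first expansion over the author and document modes. This is only an illustration under stated assumptions: the text does not define exactly what one crawl level covers, so the sketch assumes each level expands from the current authors to their documents and to those documents' co-authors. The lookup tables `docs_by_author` and `authors_by_doc` are hypothetical stand-ins for queries to the query processor 130.

```python
def crawl(start_author, crawl_level, docs_by_author, authors_by_doc):
    """Collect a bipartite author/document graph around start_author.

    Returns (authors, documents, edges), where each edge is an
    (author, document) pair indicating authorship.
    """
    authors = {start_author}
    documents = set()
    edges = set()
    frontier = {start_author}
    for _ in range(crawl_level):
        next_frontier = set()
        for author in frontier:
            for doc in docs_by_author.get(author, ()):
                documents.add(doc)
                for co_author in authors_by_doc.get(doc, ()):
                    edges.add((co_author, doc))
                    if co_author not in authors:
                        authors.add(co_author)
                        next_frontier.add(co_author)
        frontier = next_frontier  # newly found co-authors seed the next level
    return authors, documents, edges


# Hypothetical document base: A and B wrote d1, B and C wrote d2, C and D wrote d3.
authors_by_doc = {"d1": ["A", "B"], "d2": ["B", "C"], "d3": ["C", "D"]}
docs_by_author = {}
for doc, names in authors_by_doc.items():
    for name in names:
        docs_by_author.setdefault(name, []).append(doc)
```

Every edge produced this way joins an author to a document, so the result is bipartite by construction.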

An important issue in forming the k-partite graph from the results of successive queries is to decide when two results refer to a single instance. For example, suppose a document is authored by “John Black” and another by “John D. Black”. The k-partite graph creation must decide if these both refer to the same individual, thus allocating a single node to cover both, or whether these are two different individuals, thus meriting two different nodes. In the preferred embodiment, heuristics are used so that two names are held to refer to the same individual if both are known to be affiliated with the same organization and the only difference is that one has a fully specified middle name and the other has only a middle initial, where the middle initial and the first letter of the fully specified name are the same. This is an issue beyond names and must be settled for all the various modes of the k-partite graph.
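The name-matching heuristic stated above can be sketched as follows. The sketch assumes names are written as "First Middle Last" with whitespace-separated parts, that organizations are compared as plain strings, and that identical names in the same organization also match; the function name `same_person` is illustrative.

```python
def same_person(name_a, org_a, name_b, org_b):
    """Apply the middle-initial heuristic to two (name, organization) pairs."""
    if org_a != org_b:
        return False
    a, b = name_a.split(), name_b.split()
    if a == b:
        return True
    # The heuristic only covers first/middle/last names that differ in the middle.
    if len(a) != 3 or len(b) != 3:
        return False
    if (a[0], a[2]) != (b[0], b[2]):
        return False
    mid_a, mid_b = a[1].rstrip("."), b[1].rstrip(".")
    short, full = sorted((mid_a, mid_b), key=len)
    # One side must be a single initial that starts the other side's middle name.
    return len(short) == 1 and len(full) > 1 and full.startswith(short)
```

Read literally, the rule covers only a middle name against a middle initial, so this sketch treats "John Black" and "John D. Black" as different individuals; an implementation might choose to relax that.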

The various fields of the k-partite graph data structure and their purposes are discussed below with reference to FIG. 3. The creation of the k-partite graph in terms of nodes pointing to other nodes via edges relies largely on the prior art, in that the method of connecting nodes in a k-partite graph can use the same techniques used for general graphs. See Deo, N., 1974, entitled Graph theory with applications to engineering and computer science, P. 270-273, Prentice-Hall, Inc: Englewood Cliffs, N. J. However, the k-partite graph does require additional fields for its correct manipulation. The fields are included in FIG. 3 300, which defines a preferred embodiment for the k-partite data structure. The use and manipulation of these fields is discussed below with reference to FIG. 3 300. This data structure is also referenced in FIG. 1 300 in order to show that the k-partite graph generator creates, initializes, and manipulates the k-partite graph data structures 300.

The Directed Acyclic Graph Index Generator 120 assembles indices on the nodes in the k-partite graph. The structure of these indices and additional nodes within them are found by making additional queries to the query processors 130. Continuing the example above, a query might be issued on each of the authors identified above, in order to place them into an organizational hierarchy. The resulting organizational hierarchy would serve as a Directed Acyclic Graph Index (DAGI) on the k-partite graph. The DAGI Generator stores the DAGIs into DAGI data structures as described in FIG. 4 400. In addition, the DAGI generator process 120 is described below as FIG. 7.

For convenience, the k-partite graph generator 110 and DAGI generator 120 are shown here as separate components. Note, however, that both systems may be components of the k-partite graph with DAGIs (KPG-DAGI) process 125 that is defined through the KPG-DAGI data structure, which is shown in FIG. 5 500 and described further below. The KPG-DAGI data structure is comprised of a k-partite graph 300 with one or more DAGIs, as described in FIG. 4 400, where the DAGI nodes may be in 1-1 correspondence with the k-partite nodes, thus allowing the DAGI to act as an index on the k-partite graph nodes. The requirement that the nodes of the k-partite graph and DAGI be in correspondence does not mandate that they be part of a single computing process, nor that they reside on the same processing element, as is discussed below. The KPG-DAGI process 125 provides the means to manipulate and analyze the k-partite graph by performing aggregations and hiding k-partite graph nodes. These functions may be performed interactively via a Graphical User Interface 200. See FIG. 7 (120) below for more detail.

Data objects 140 on one computer may be accessed over the network by another computer using the Web (http) protocol, a networked file system protocol (e.g., NFS, AFS), or some other protocol. Services on one computer (e.g., query processors 130) may be invoked over the network by another computer using the Web protocol, a remote procedure call (RPC) protocol, or some other protocol.

A number of possible configurations for accessing data objects, indexes, and services locally or remotely are depicted in the present figure. These possibilities are described further below. One configuration is a stand-alone workstation 195 that may or may not be connected to a network 105. The stand-alone system 195 has data objects 140 and a query processor 130 located locally. The stand-alone system 195 also has a k-partite graph with DAGI system 125 installed locally.

When the system is used, a query is input to the workstation 195 and the results processed by the k-partite graph generator 110 and DAGI generator 120 using query processor 130. A second configuration is 185, a workstation with data objects, query processor, and analysis connected to a network 105. This configuration is similar to the stand-alone workstation 195, except that 185 is always connected to the network 105. Also, the local query processor 130 may query local data objects 140 and/or remote data objects accessed via the network 105, and utilize either a remote k-partite graph generator 110 and a local DAGI Generator 120 or a remote k-partite graph generator 110 and a remote DAGI Generator 120 accessed via the network 105. When queries are input at the workstation 185, they may be processed locally at 185 using the local k-partite graph generator 110, local DAGI Generator 120, and local query processor 130. Alternatively, the local k-partite graph generator 110 and DAGI Generator 120 may access a remote query processor 130 (e.g. on system 175) via the network 105. Alternatively, the workstation 185 may access a remote k-partite graph generator 110 and DAGI Generator 120 via the network 105.

Another possible configuration is 175, a workstation with a query processor 130 only. Computer 175 is similar to computer 185 with the exception that there are no local data objects 140. The local query processor 130 accesses data objects 140 via the network 105. Otherwise, as in computer 185, the query processor 130, k-partite graph generator 110, and DAGI Generator 120 may be accessed locally or remotely via the network 105 when processing queries. Another possible configuration is computer 180, a workstation with data objects only. The data objects 140 stored locally at computer 180 may be accessed by remote k-partite graph generator 110 and DAGI Generator 120 via the network 105. When queries are entered at computer 180, k-partite graph generator 110, DAGI Generator 120, and query processor 130 must all be accessed remotely via the network 105.

Another possible configuration is computer 190, a client station with no local data objects 140, query processor 130, k-partite graph generator 110, or DAGI Generator 120. When queries are entered at computer 190, k-partite graph generator 110, DAGI Generator 120, and query processor 130 must all be accessed remotely via the network 105.

Another possible configuration is computer 170, a typical web server. Queries are entered at another workstation (e.g., 175, 180, 185, or possibly 195) or a client station (e.g., 190) and sent for processing to the web server 170 via the network 105. The web server 170 uses a remote k-partite graph generator 110, DAGI Generator 120 and query processor 130 (accessed via the network 105) to process the query. Alternatively, one or more of these functions (110, 120, and 130) can reside on the web server 170. The results are returned to the workstation or client station from which the query was originally sent.

FIG. 2 is a series of screen shots showing the graphical user interface and key functionalities via the preferred embodiment 200. The first screen shot 210 shows a bipartite graph (i.e., a k-partite graph where k=2) where one mode is authors and the other mode is documents. Hence, all of the nodes are either author nodes 220 or document nodes 230. Further, all of the edges 240 are between author nodes 220 and document nodes 230, so that none of them inter-connect author nodes 220 or document nodes 230. Note that the author nodes 220 are distinctively shaped and shaded relative to the document nodes 230. Important functions provided by the preferred embodiment of the Graphical User Interface 200 include laying out the graph so it has an aesthetic appearance; providing interactive controls for selecting nodes, dragging nodes, hiding nodes, aggregating nodes; and display options for whether to display node labels, display arcs, zoom in and out, centering, etc.

In the preferred embodiment the graph layout is done through a prior art method that uses multidimensional scaling (MDS) to provide an aesthetic layout of the graph by basing the layout on the graph's internal structure. The structure is characterized for each pair of nodes by their “graph distance”, or the shortest path between the pair of nodes. Further, their visual relationship is calculated via their “Euclidean distance”, or the distance on the screen between the nodes. A “stress” score is then computed by the difference of the two. An overall stress value is computed for each node by summing its relations to all of the other nodes. This stress is used to calculate the movement of each node by moving each node a fraction of its stress multiplied by a constant. This basic algorithm is elaborated by not considering all of the nodes from the beginning. Rather, only log N of the nodes are arranged, where N is the number of the nodes. After each series of iterations, double the number of nodes are arranged as the last time, where the new nodes are heuristically placed based on the position of the two nodes with the shortest graph distance. For the prior art description, see Cohen, J. D. (September, 1997). Drawing graphs to convey proximity: an incremental arrangement method. ACM Trans. Comput.-Hum. Interact. 4, 3, pp. 197-229. Following the MDS layout an additional post-processing step is needed to ensure that the nodes do not overlap. This is done by a sweep line algorithm that sweeps from left to right and then from top to bottom, moving nodes that overlap to the right or downward, respectively.
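The basic stress-driven movement described above can be sketched in a few lines. This is a simplified illustration, not the cited incremental arrangement method: it arranges all nodes at once, omits the log N incremental schedule and the sweep-line overlap removal, and assumes unweighted edges, two-dimensional positions, and a step constant of 0.1. The function names and parameters are illustrative.

```python
import math
from collections import deque


def graph_distances(adjacency):
    """All-pairs shortest path lengths (in edges) by breadth-first search."""
    dist = {}
    for source in adjacency:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adjacency[node]:
                if nbr not in seen:
                    seen[nbr] = seen[node] + 1
                    queue.append(nbr)
        dist[source] = seen
    return dist


def layout(adjacency, positions, step=0.1, iterations=100):
    """Move every node along its summed stress, scaled by the constant step."""
    dist = graph_distances(adjacency)
    pos = dict(positions)
    for _ in range(iterations):
        moves = {}
        for i, (xi, yi) in pos.items():
            mx = my = 0.0
            for j, (xj, yj) in pos.items():
                if i == j or j not in dist[i]:
                    continue  # skip self and unreachable nodes
                euclid = math.hypot(xj - xi, yj - yi)
                if euclid == 0.0:
                    continue  # coincident nodes give no direction
                stress = euclid - dist[i][j]  # positive means too far apart
                mx += stress * (xj - xi) / euclid
                my += stress * (yj - yi) / euclid
            moves[i] = (mx, my)
        pos = {n: (x + step * moves[n][0], y + step * moves[n][1])
               for n, (x, y) in pos.items()}
    return pos
```

In this sketch, two connected nodes that start three units apart are pulled together until their screen distance approaches their graph distance of one.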

In the second screen shot 250 the DAGI appears 255. The DAGI appears as a tree control, even though it represents a directed acyclic graph. This is accomplished by representing the directed acyclic graph that defines the DAGI using a tree. The difference between trees and directed acyclic graphs is that all nodes in a tree have at most one parent (the root does not have any parents). Directed acyclic graphs can be represented as a tree by replicating portions. So, if a node A has two parents B and C, a second copy of A is created, A′, so that B points to A and C points to A′. However, the replicated nodes still refer to the same k-partite graph nodes. So, if A referred to k-partite graph node D, the replicated node A′ also points to D. This allows the tree control to manifest the correct behavior. In the screen shot 250 the tree control representing the DAGI 255 has two nodes selected, Author A 260 and Author B 270. This causes the k-partite graph nodes referred to by them to also be highlighted. These k-partite graph nodes are Author A 265 and Author B 275.

In the third screen shot 280 the effect of an aggregation is shown. The aggregation has been accomplished using the tree control implementing the DAGI so that Corporation 1 285 is a new category node and appears in the k-partite graph as Corporation 1 290. The new category node Corporation 1 290 inherits the mode of the nodes it aggregates, hence it is considered an author node as well, and thus is shaped and shaded as an author node. Further, the new category node Corporation 1 290 has a series of edges 295 that connect it to the same nodes to which Author A 265 and Author B 275 were connected.
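The replication of shared DAGI nodes into a tree, as described for the tree control, can be sketched as a recursive copy. This is a minimal illustration assuming the DAGI is given as a mapping from each node name to its child names and a mapping from DAGI node names to k-partite node names; `TreeNode`, `dag_to_tree`, and both mappings are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TreeNode:
    name: str
    knode: Optional[str]  # name of the k-partite graph node referred to, if any
    children: list = field(default_factory=list)


def dag_to_tree(name, children_of, knode_of):
    """Copy the DAG below `name` into a tree, replicating shared nodes.

    A node with several parents is copied once per parent, and every copy
    keeps a reference to the same k-partite graph node.
    """
    node = TreeNode(name, knode_of.get(name))
    for child in children_of.get(name, ()):
        node.children.append(dag_to_tree(child, children_of, knode_of))
    return node


# The example from the text: A has two parents, B and C, and refers to knode D.
children_of = {"Root": ["B", "C"], "B": ["A"], "C": ["A"]}
knode_of = {"A": "D"}
tree = dag_to_tree("Root", children_of, knode_of)
```

The two copies of A are distinct tree nodes, yet both point to D, so selecting either copy in a tree control would select the same k-partite graph node. The recursion terminates because the input is acyclic.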

In subsequent figures, a number of data-structures are described as tables. This is for convenience of drawing and description. In an actual implementation, any usual data-structure such as a normal array, an associative array, a linked list, a hash table, or any other structure may equivalently be used without affecting the invention described herein.

The table in FIG. 3 is one preferred set of data structures used to implement the present invention. In FIG. 3 the data structure 300 represents a k-partite graph 300 as a linked list of nodes. Each node has a name 310 that specifies both a system identifier and a printable label. The printable label is displayed by the Graphical User Interface 200 and is computed by the labeling process 615. A DAGI node refers to a k-partite graph node through this name 310.

Each node is connected to other nodes by edges. The edges 320 comprise a linked list where each edge 320 in the list specifies a destination node and the next edge 320 in the linked list. For flexibility the edges 320 are doubly linked so each edge 320 also points to the preceding edge 320. The nodes in a k-partite graph are partitioned into sets so that each node is a member of exactly one set and no node points to any other node that is a member of the same set. Each set of a k-partite graph is also referred to as a mode 330. This is convenient because the varying sets or modes often have some semantics, such as the set of authors or the set of documents. The node's mode 330 is represented as a string.

In a preferred embodiment, nodes can be hidden or visible. The hiddenP field 340 records whether or not a node is hidden. If the value is true then the node is hidden, else it is visible. Similarly, a node may be selected. The selectedP field 350 records this state. If the value is true then the node is selected. A node may be part of an aggregation. This is important to know since this influences how the node is further processed. For example, if it is part of an aggregation it should not be placed into another aggregation as each node can be part of only one aggregation. The inaggP field 360 records this state. If the value is true then the node is part of an aggregation. The list of nodes that are aggregated needs to be stored so that when a disaggregation occurs they can be properly updated. The inaggnext field 370 serves this purpose by providing a way to form a linked list of nodes all of which participate in the same aggregation. It is also important to know whether a node is itself defined via an aggregation (i.e., is a category node) and, if so, which nodes it has aggregated. The aggP field 380 serves these functions. If the value is null then the node is not a category node. If it is not null then it points to a node that it is aggregating; further, that node points to the rest of the nodes involved in the aggregation via the inaggnext field 370. The complete set of nodes involved in the k-partite graph is stored via a linked list. This allows the k-partite graph to be represented as a single node that links the other nodes in the graph. This linked list is maintained through the next field 390.
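As a rough illustration of the node record of FIG. 3, the fields described above can be written as a Python data class. The field names follow the text; the class name `KNode`, the use of a Python list for the edges 320, and the helper `aggregated_members` are assumptions of this sketch rather than part of the described embodiment.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)  # identity comparison, so cyclic links are harmless
class KNode:
    name: str                                    # name 310
    mode: str                                    # mode 330, stored as a string
    edges: List["KNode"] = field(default_factory=list, repr=False)  # edges 320
    hiddenP: bool = False                        # 340: True when hidden
    selectedP: bool = False                      # 350: True when selected
    inaggP: bool = False                         # 360: True when aggregated
    inaggnext: Optional["KNode"] = field(default=None, repr=False)  # 370
    aggP: Optional["KNode"] = field(default=None, repr=False)       # 380
    next: Optional["KNode"] = field(default=None, repr=False)       # 390


def aggregated_members(category: KNode) -> Iterator[KNode]:
    """Walk aggP, then the inaggnext chain, yielding each aggregated node."""
    member = category.aggP
    while member is not None:
        yield member
        member = member.inaggnext


author_a = KNode("Author A", "author", inaggP=True, hiddenP=True)
author_b = KNode("Author B", "author", inaggP=True, hiddenP=True)
author_a.inaggnext = author_b
corporation = KNode("Corporation 1", "author", aggP=author_a)
member_names = [node.name for node in aggregated_members(corporation)]
```

A node whose `aggP` is `None` yields no members, which matches the reading that a null aggP marks an ordinary (non-category) node.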

FIG. 4, comprising FIGS. 4A and 4B, is one preferred set of data structures used to implement the present invention.

In FIG. 4A the data structure 405 represents a DAGI. The DAGI is made up of a linked list of nodes much like the k-partite graph 300. Beyond being a linked list of nodes where each node has a name 410, each DAGI node or dnode 420 refers to a k-partite node or knode 430; thus a 1-1 relationship may be established that is used for DAGI operations including selection, aggregation, and hiding.

In FIG. 4B the data structure 435 represents a DAGI Tree, that is, a DAGI that is represented as a tree. When a tree is used to represent a DAGI some DAGI nodes could be replicated. Thus, in this embodiment, the relation of DAGI nodes to DAGI tree nodes is 1-many. The DAGI tree node 435 includes a name 440, a dnode 450 which establishes the connection to the k-partite graph via the DAGI 405, and a DAGI tree node or tnode 460. In addition, there is a next field 470 that connects all of the DAGI tree nodes corresponding to a given DAGI node in a circular list. This allows a mapping from a DAGI tree node to all of the other DAGI tree nodes that correspond to the same DAGI node.

The table in FIG. 5 is one preferred set of data structures used to implement the present invention. In FIG. 5 the data structure 500 represents a k-partite graph with associated DAGIs (KPG-DAGI). The KPG-DAGI data structure 500 includes a k-partite graph as described in FIG. 3 300 and one or more DAGIs as described in FIG. 4 400. The KPG-DAGI data structure records a name 510, its k-partite graph 520 and zero or more DAGIs through the DAGI field 530 and nextdagi field 550. When at least one DAGI is present it is stored in the DAGI field 530. The inferredP field 540 records whether the DAGI was inferred from the structure of the k-partite graph (i.e., intrinsic) or was obtained from some external source (i.e., extrinsic). This distinction may be important because intrinsic DAGIs are recalculated whenever they are needed based on the current configuration of the k-partite graph (i.e., what nodes are visible and what aggregate nodes exist) whereas extrinsic DAGIs are static. If more than one DAGI is associated with the KPG-DAGI data structure the next DAGI is identified using the nextdagi field 550. More detail on the k-partite graph data structure is found above on FIG. 3 300, while more detail on the DAGI data structure is found above on FIG. 4 400, and detail on the creation of the DAGI data structure is found on FIG. 7 120.

FIG. 6 is a flowchart showing the method steps of one preferred process executed by the present invention. By executing the process in 125 the system 100 enables a k-partite graph to be investigated via a DAGI. The process begins by a query being entered that retrieves a set of digital objects from which a k-partite graph is formed through invoking the k-partite graph generator 110. For example, in the case of searching a document collection a query might ask to find documents by a given set of authors or give a set of keywords to use for a full text search on the documents. From the retrieved documents a k-partite graph is created that links authors with the documents they have authored. (See description of FIGS. 1 and 3, above.)

The next step is to invoke the DAGI generator 120 to create DAGIs on various modes of the graph. For example, the documents can be organized into topics. (See description of FIGS. 1, 4, and 5 above and FIG. 7 below.) Since a given document might fit into multiple topics it is not possible to use a simple tree hierarchy; rather, a directed acyclic graph is needed.

In step 615 succinct labels are derived for nodes when the nodes are associated with text and a succinct node label is not already available. This allows the nodes to be compactly viewed when the graph is drawn in step 620. Using the graphical user interface 200 users can request that nodes be aggregated. Step 625 tests whether an aggregation request has occurred. If it has then step 630 directs that the aggregation be performed. In step 635 a test occurs whether a request to remove a mode has occurred. If it has, step 640 accomplishes that removal. In step 645 a test occurs whether a request to select or create an index has occurred. If it has, step 650 directs that the index be put into use including creating it, if necessary. The process of creating a DAGI is described in FIG. 7 120. In step 660 a test occurs whether a request to use the DAGI for hiding, aggregation or disaggregation has occurred. If it has, step 670 directs this operation to happen.

FIG. 7, comprising FIGS. 7A and 7B, shows flowcharts showing the method steps of two preferred processes executed by the present invention. By executing the processes specified in FIG. 7A or FIG. 7B a DAGI is created. These DAGIs are useful for simplifying and analyzing the k-partite graphs. These processes exemplify the creation of a DAGI which was listed as step 120 in FIGS. 1 and 6. In both FIGS. 7A and 7B the DAGIs being created are intrinsic, that is, they are based on the structure of the k-partite graph. Other prior art intrinsic traversals of the graph structure, not flowcharted here, divide the nodes into categories, for example, by mode, by degree (i.e., number of nodes to which they are connected), by connected components, by strong components, or by bi-connected components. Other prior art intrinsic traversals of the graph structure, also not flowcharted here, aim to capture more complex aspects of the k-partite graph's structure, for example, giving a breadth-first traversal, giving a depth-first traversal, and reducing the k-partite graph to a directed acyclic graph thus removing only those nodes which result in cycles. These prior art analyses of a graph's structure are described in Deo, N., 1974, entitled Graph theory with applications to engineering and computer science, P. 268-327, Prentice-Hall, Inc: Englewood Cliffs, N.J.

The flowchart in FIG. 7A 705 is sufficiently similar and complex to illustrate how to create a DAGI given that the structure of the k-partite graph has been analyzed, as is provided, for example, by prior art for the cases listed above. The process shown in FIG. 7A 705 derives a DAGI based on the implicit partial order of nodes in the k-partite graph. The partial ordering is defined as a relation on the nodes within a given mode based on their connection to other nodes. If two nodes, x and y, are of the same mode, and x is connected to a superset of the nodes of y, then x is above y in the partial order. The order is partial because the ordering relation between x and y may not be defined, that is, x may be neither less than nor greater than y. This occurs if x is connected to a node to which y is not connected and y is connected to a node to which x is not connected. More formally, if next(x) is the set of nodes to which x is connected, and similarly next(y) for y, and both next(x) minus next(y) and next(y) minus next(x) are non-empty, then x and y are not ordered with respect to one another.
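The ordering relation can be checked directly with set comparisons. The following Python sketch assumes that next(x) and next(y) are available as Python sets; the function name `compare` and its string results are illustrative only.

```python
from typing import Set


def compare(next_x: Set[str], next_y: Set[str]) -> str:
    """Classify x relative to y from the sets of nodes each is connected to."""
    if next_x == next_y:
        return "equal"
    if next_x > next_y:       # proper superset: x is above y
        return "above"
    if next_x < next_y:       # proper subset: x is below y
        return "below"
    return "unordered"        # each set has a node the other lacks


above = compare({"doc1", "doc2"}, {"doc1"})
unordered = compare({"doc1", "doc2"}, {"doc2", "doc3"})
```

The last case corresponds to the situation in which neither node's connections contain the other's, so no ordering is defined between them.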

The process in 705 begins by creating a root node for the DAGI in step 710. In step 720 nodes are created for each of the modes and those nodes are connected to the DAGI root node. In step 730 DAGI nodes are created corresponding to each k-partite node. The association between the two is recorded using the data structure in FIG. 4 400. In step 740 the partial order for each mode is computed. Pseudo code for accomplishing this operation is as follows. In the pseudo code, for DAGI node N, kPartite(N) gives the k-partite graph node corresponding to it, and degree(N) gives the number of nodes to which N is connected.

  1. Let Queue Q = DAGI Nodes of 730; Let Queue L = null
  2. Sort Q, least to most, based on the degree of their corresponding k-partite graph nodes
  3. Unmark all the k-partite graph nodes
  4. While (Q is not empty) {
  5.   N = First(Q); Q = Rest(Q)
  6.   Mark the nodes connected to kPartite(N)
  7.   Queue LL = L
  8.   Sort LL, most to least, based on the degree of the corresponding k-partite graph nodes
  9.   While (LL is not empty) {
 10.     M = First(LL); LL = Rest(LL)
 11.     If all the nodes connected to kPartite(M) are marked {
 12.       Create an edge from N to M
 13.       Remove from LL all the nodes pointed to by M
 14.     }
 15.   }
 16.   Unmark the nodes connected to kPartite(N)
 17.   Add N to L
 18. }
 19. For all nodes N, such that N does not have an incoming edge, create an edge, R to N, where R is the root of the DAGI

A few points about how the pseudo code functions. In line 2, the DAGI nodes are sorted from least to most so as to build the partial order from the lowest to the highest. In lines 7 and 8, the nodes that could be lower than N are selected and sorted, since only a node with a lower degree can be lower in the partial order. When a node is found to be lower, an edge is created (line 12) and all nodes found to be lower than it are removed from consideration so that redundant edges are not created (line 13). The partial order described here, or lattice as it is called in social network analyses, is useful for finding relationships between people based on their linkages. In particular, the partial order can be used to estimate expertise or segment a group of authors into communities.
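The pseudo code for step 740 can be sketched in Python as follows. This is an illustrative reading, not the patented implementation: the dictionary `neighbors` (node name to the set of nodes it is connected to) stands in for the kPartite(N) lookup, marking is replaced by a subset test, and "the nodes pointed to by M" is interpreted as everything already below M. All function and variable names are assumptions.

```python
from typing import Dict, List, Set, Tuple


def partial_order_edges(neighbors: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Return DAGI edges (upper, lower) for the nodes of one mode."""
    below: Dict[str, List[str]] = {n: [] for n in neighbors}
    edges: List[Tuple[str, str]] = []
    processed: List[str] = []                      # queue L of the pseudo code
    # Line 2: least to most by degree (name breaks ties deterministically).
    for n in sorted(neighbors, key=lambda k: (len(neighbors[k]), k)):
        skip: Set[str] = set()
        # Line 8: candidates from most to least degree.
        for m in sorted(processed, key=lambda k: -len(neighbors[k])):
            if m in skip:
                continue
            if neighbors[m] <= neighbors[n]:       # line 11: all of m's are marked
                edges.append((n, m))               # line 12
                below[n].append(m)
                stack = list(below[m])             # line 13: drop nodes under m
                while stack:
                    d = stack.pop()
                    if d not in skip:
                        skip.add(d)
                        stack.extend(below[d])
        processed.append(n)                        # line 17
    return edges


def top_nodes(neighbors: Dict[str, Set[str]], edges: List[Tuple[str, str]]) -> List[str]:
    """Line 19: nodes without an incoming edge hang directly off the root."""
    has_parent = {lower for _, lower in edges}
    return sorted(n for n in neighbors if n not in has_parent)


authors = {"a": {"d1"}, "b": {"d1", "d2"}, "c": {"d1", "d2", "d3"}, "d": {"d4"}}
order = partial_order_edges(authors)
roots = top_nodes(authors, order)
```

In this example `c` sits above `b`, which sits above `a`; no direct edge from `c` to `a` is created because `a` is already reachable through `b`. Nodes with identical connection sets end up ordered by processing sequence, which is a simplification of this sketch.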

A process to remove a mode from the k-partite graph and use its nodes and their relations (as indicated by the nodes' edges) as the basis of a DAGI is described in FIG. 7B 755. The first steps of the process are similar to the ones described in FIG. 7A. In step 760 a node R is created to serve as root of the DAGI. In step 770 nodes are created for each of the modes of the k-partite graph, except for the mode M that is being made into a DAGI, and those nodes are linked to R. Let N stand for the nodes in mode M that are being removed from the k-partite graph and used for the basis of the DAGI. In order to create a DAGI where the leaves of any given interior node are all of the same mode, the nodes N must be divided based on the mode of the nodes to which they are linked via their edges. Accordingly, in step 775 nodes are created that divide each node y in N into multiple nodes such that the number of resulting nodes is equal to the number of different modes to which node y is connected. These new nodes are denoted y+m, where y is the original node and m is one of the modes. In step 780 each node y in N is again considered by examining each of its edges that connect the node y to some node z of mode m. For each such edge a new DAGI node is created to represent z in this context and an edge is added from y+m to z. This step completes the creation of the DAGI. In step 785 all nodes of mode M are removed from the k-partite graph, thus completing the transfer of the mode to the DAGI.

An example clarifies the usefulness of the operation of removing a k-partite mode and using it as the basis of a DAGI as described in flowchart 755. Say a k-partite graph represents a document base with the modes author, organization, and document. Authors are connected to the documents they authored and organizations are connected to the authors who are their members. If the mode of organization is turned into a DAGI, an index will result that can select, hide, or aggregate authors by the organizations of which they are a part.
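The steps of flowchart 755 can be sketched in Python under several assumptions: the graph is given as a `mode_of` dictionary and an undirected `adjacency` dictionary, each y+m node is attached beneath the DAGI node for mode m (the text does not state where y+m nodes are attached), and the organization mode is the one moved into the index so that authors are grouped by organization. All names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def mode_to_dagi(mode_of: Dict[str, str], adjacency: Dict[str, Set[str]],
                 removed_mode: str) -> Tuple[Dict[str, List[str]], Dict[str, Set[str]]]:
    dagi: Dict[str, List[str]] = defaultdict(list)    # parent -> children
    root = "ROOT"                                      # step 760
    for mode in sorted(set(mode_of.values()) - {removed_mode}):
        dagi[root].append(mode)                        # step 770
    for y in sorted(n for n, m in mode_of.items() if m == removed_mode):
        for z in sorted(adjacency[y]):
            split = f"{y}+{mode_of[z]}"                # step 775: node y+m
            if split not in dagi[mode_of[z]]:
                dagi[mode_of[z]].append(split)         # assumed attachment point
            dagi[split].append(z)                      # step 780: edge y+m -> z
    remaining = {n: {z for z in adj if mode_of[z] != removed_mode}
                 for n, adj in adjacency.items()
                 if mode_of[n] != removed_mode}        # step 785
    return dict(dagi), remaining


mode_of = {"Org 1": "organization", "Org 2": "organization",
           "Author A": "author", "Author B": "author", "Author C": "author",
           "Doc 1": "document"}
adjacency = {"Org 1": {"Author A", "Author B"}, "Org 2": {"Author B", "Author C"},
             "Author A": {"Org 1", "Doc 1"}, "Author B": {"Org 1", "Org 2"},
             "Author C": {"Org 2"}, "Doc 1": {"Author A"}}
index, graph = mode_to_dagi(mode_of, adjacency, "organization")
```

Author B belongs to both organizations and therefore appears under two interior nodes, which is why the resulting index is a directed acyclic graph rather than a tree.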

FIG. 8, comprising FIGS. 8A and 8B, shows flowcharts showing the method steps of two preferred processes executed by the present invention. By executing the processes specified in FIG. 8 aggregation of nodes is accomplished.

The flowchart in FIG. 8A 630 gives the process for aggregating a set of nodes N in the k-partite graph when they are specified directly, that is, without a DAGI. The flowchart in FIG. 8B 670 gives the process for aggregating nodes through a DAGI.

In FIG. 8A 630 the first step 810 is to create the category node A that will serve as the aggregate. In step 815 the category node A is connected to the graph via a set of edges. The goal is to connect A to the same nodes to which the nodes in N were connected, excluding connections from one node in N to another node in N. In step 820 a conditional tests whether all of the nodes in N were of the same mode. If they were, step 825 assigns that mode to A. Otherwise, step 830 assigns to A a new mode. The problem is that when the nodes in N are of different modes it is unclear which mode should be assigned to A. In step 835 the nodes in N get their flags set indicating they are part of an aggregation and consequently should be hidden. Finally, in step 840 the nodes in N are compiled into a linked list using the inaggnext field 370 and the first element is assigned to A using the aggP field 380.
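Steps 810 through 840 can be sketched in Python as below. The sketch assumes a simplified node class with Python sets for edges, reads the linked list of step 840 as the inaggnext chain, and uses a hypothetical mode name "mixed" for the new mode of step 830; none of these choices is prescribed by the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(eq=False)  # identity-based hashing so nodes can live in sets
class KNode:
    name: str
    mode: str
    edges: Set["KNode"] = field(default_factory=set, repr=False)
    hiddenP: bool = False
    inaggP: bool = False
    inaggnext: Optional["KNode"] = field(default=None, repr=False)
    aggP: Optional["KNode"] = field(default=None, repr=False)


def connect(a: KNode, b: KNode) -> None:
    a.edges.add(b)
    b.edges.add(a)


def aggregate(name: str, members: List[KNode], mixed_mode: str = "mixed") -> KNode:
    category = KNode(name, mode="")                     # step 810
    inside = set(members)
    for node in members:                                # step 815
        for other in node.edges - inside:               # skip member-to-member edges
            connect(category, other)
    modes = {node.mode for node in members}             # steps 820 to 830
    category.mode = modes.pop() if len(modes) == 1 else mixed_mode
    for node in members:                                # step 835
        node.inaggP = True
        node.hiddenP = True
    for node, following in zip(members, members[1:] + [None]):   # step 840
        node.inaggnext = following
    category.aggP = members[0] if members else None
    return category


author_a, author_b = KNode("Author A", "author"), KNode("Author B", "author")
doc_1, doc_2 = KNode("Doc 1", "document"), KNode("Doc 2", "document")
connect(author_a, doc_1)
connect(author_b, doc_1)
connect(author_b, doc_2)
corporation = aggregate("Corporation 1", [author_a, author_b])
```

After the call, Corporation 1 carries the author mode and is connected to both documents, while Author A and Author B are flagged as aggregated and hidden.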

The process in FIG. 8B 670 considers the case where a number of nodes in the DAGI have been selected for aggregation. Only interior nodes (i.e., non-leaf nodes) are eligible for aggregation since leaf nodes have no nodes to aggregate. In step 850 those DAGI nodes selected for aggregation are ordered via a traversal of the DAGI Tree that visits children before their parents (a postorder traversal). This ordering is important in order to ensure that nested aggregations take place correctly. Consider the case where P is aggregating Q and R, and Q is aggregating Y and Z. If P is aggregated before Q then the Q aggregation cannot occur. The sorting ensures that in this case the Q aggregation would happen before the P aggregation. Essentially, the sorting puts the more nested nodes before the less nested ones. In step 860 each of the individual aggregations is accomplished by finding the nodes corresponding to DAGI nodes in the k-partite graph and then calling the aggregation process 630.
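The ordering requirement of step 850, with more deeply nested selections aggregated first, can be illustrated with a children-first visit of a hypothetical DAGI Tree. The dictionary encoding and the function name `aggregation_order` are assumptions of this sketch.

```python
from typing import Dict, List, Set


def aggregation_order(tree: Dict[str, List[str]], root: str, selected: Set[str]) -> List[str]:
    """List the selected interior nodes so that nested ones come first."""
    order: List[str] = []

    def visit(node: str) -> None:
        for child in tree.get(node, []):
            visit(child)                       # descend before recording the parent
        if node in selected and tree.get(node):
            order.append(node)                 # only interior nodes are eligible

    visit(root)
    return order


# P aggregates Q and R; Q aggregates Y and Z.
dagi_tree = {"P": ["Q", "R"], "Q": ["Y", "Z"]}
sequence = aggregation_order(dagi_tree, "P", {"P", "Q", "Y"})
```

Here the leaf Y is ignored even though it was selected, and Q is scheduled ahead of P so that the nested aggregation can still take place.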

FIG. 9 shows a flowchart showing the method steps of a preferred process executed by the present invention. By executing the process specified in FIG. 9 640 the k-partite graph has one of its modes removed. A mode is a set of one or more nodes in the k-partite graph such that no node in a given mode is connected to another node in the same mode. In general, a k-partite graph with j modes can be considered simpler than a k-partite graph with j+1 modes, all other things being equal. Hence, removing a mode is a primary way to simplify a k-partite graph. The basic idea is to remove all the nodes of a given mode M, but retain as much of the structural information represented by those nodes as possible by adding new edges that connect together nodes that were indirectly linked by nodes in M. For example, if there is a node of mode M called y and it were connected to a node x and a node z, then when mode M is eliminated x would be connected to z (if it were not already) before y is eliminated. However, this connection cannot occur if x and z are part of the same mode. In step 910 the nodes N belonging to the mode M being eliminated are identified. See FIG. 3 above. In step 920 each of the possible new edges, x to z, is raised for consideration and created as long as the result would not be an edge connecting two nodes of the same mode. In step 930 all the nodes of mode M are removed. This process is further specified through the following pseudo-code.

 Given mode M is being removed, identify the nodes N being removed
 For each node y in N {
   For each edge (x, y) {
     For each edge (y, z) {
       If mode(x) != mode(z)
         Create an edge (x, z)
     }
   }
 }
 For each node y in N {
   Remove y
 }

The utility of this operation is illustrated using the same example as above. Say a k-partite graph represents a document base with the modes author, organization, and document. Authors are connected to the documents they authored and organizations are connected to the authors who are their members. If the mode of author is removed then organizations will be connected to documents where a member of that organization served as an author.
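The mode-removal procedure and the author example can be sketched together in Python. The sketch assumes the graph is held as a `mode_of` dictionary plus an undirected `adjacency` dictionary; these structures and the function name `remove_mode` are illustrative stand-ins for the linked-list representation of FIG. 3.

```python
from typing import Dict, Set


def remove_mode(mode_of: Dict[str, str], adjacency: Dict[str, Set[str]],
                removed_mode: str) -> Dict[str, Set[str]]:
    removed = {n for n, m in mode_of.items() if m == removed_mode}   # step 910
    result = {n: set(adj) for n, adj in adjacency.items()}
    for y in removed:                                                # step 920
        for x in adjacency[y]:
            for z in adjacency[y]:
                if mode_of[x] != mode_of[z]:     # never link two nodes of one mode
                    result[x].add(z)
                    result[z].add(x)
    # Step 930: drop the removed nodes and every edge that touched them.
    return {n: adj - removed for n, adj in result.items() if n not in removed}


mode_of = {"Org 1": "organization", "Author A": "author", "Author B": "author",
           "Doc 1": "document", "Doc 2": "document"}
adjacency = {"Org 1": {"Author A", "Author B"},
             "Author A": {"Org 1", "Doc 1"}, "Author B": {"Org 1", "Doc 2"},
             "Doc 1": {"Author A"}, "Doc 2": {"Author B"}}
simplified = remove_mode(mode_of, adjacency, "author")
```

With the author mode removed, Org 1 is linked directly to the two documents written by its members, and no document-to-document edge is introduced.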

FIG. 10 shows a flowchart showing the method steps of a preferred process executed by the present invention. By executing the process specified in FIG. 10 650 the k-partite graph is simplified via automated aggregations using a DAGI. The goal is to provide an easy way for the user to specify what parts of the k-partite graph are of interest and then to automatically background the rest of the nodes by suitably aggregating them. The process begins in step 1010 by selecting some subset of the DAGI nodes as focal nodes; let this set of focal nodes be called D. Further, let D′ be the set of nodes in the k-partite graph referred to by nodes in D. These nodes D′ should not be aggregated since these are the nodes of primary interest. This is accomplished by hiding the nodes in D′, thus guaranteeing they cannot be aggregated until they are unhidden. In step 1020 a list of nodes to aggregate, called A, is begun. The parents of D are added to this list. This means if a node P has nodes Q, R and S as children and S was selected as a focal node, then P will be aggregated, thereby forming a category node containing only Q and R, but not S. A further elaboration in step 1020 is that the ancestors of D are marked, indicating that none of them should be aggregated. This step is necessary to ensure that no aggregations of aggregations occur. In step 1030 an upward sweep is made beginning from D, which is now the parents of the focal nodes. The upward sweep aims to not aggregate any aggregations, but to aggregate everything else. This is accomplished by aggregating the siblings of the parents and then the siblings of the grandparents and so on. Aggregating only the siblings and not the ancestors themselves means that aggregations of aggregations will not occur. A further guarantee of that is needed since directed acyclic graphs can have cross links that make a given node both an ancestor and a sibling of another given node. This guarantee is provided by explicitly marking all nodes which should not be aggregated, namely, ancestors of aggregated nodes. In step 1040 the unmarked nodes in A are aggregated. Finally, in step 1050 the focal nodes are unhidden, thus showing them in relation to the aggregated nodes. This process is further described by the following pseudo code.

 Let D be the focal DAGI nodes and D′ be the corresponding k-partite nodes
 for each node n in D′ {
   hide n
 }
 D = Parents of D
 List A = D
 Mark the ancestors of D
 while (D != null) {
   S = Siblings of D
   Add S to A
   Mark the ancestors of S
   D = Parents of D
 }
 foreach node n in A {
   if unmarked(n)
     Aggregate(n)
 }
 foreach node n in D′ {
   unhide(n)
 }
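The selection of nodes to aggregate around a set of focal nodes can be sketched in Python. This is one possible reading of the pseudo code, with several assumptions: the DAGI is a dictionary from parent to children, hiding and unhiding of the focal k-partite nodes is not modeled, and leaf nodes are filtered out at the end because only interior nodes are eligible for aggregation. The function name and the example hierarchy are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def plan_focal_aggregation(children: Dict[str, List[str]], focal: Set[str]) -> List[str]:
    parents: Dict[str, Set[str]] = {}
    for parent, kids in children.items():
        for kid in kids:
            parents.setdefault(kid, set()).add(parent)

    def parents_of(nodes: Iterable[str]) -> Set[str]:
        return set().union(*(parents.get(n, set()) for n in nodes))

    def ancestors_of(nodes: Iterable[str]) -> Set[str]:
        seen: Set[str] = set()
        frontier = parents_of(nodes)
        while frontier:
            seen |= frontier
            frontier = parents_of(frontier) - seen
        return seen

    def siblings_of(nodes: Set[str]) -> Set[str]:
        sibs: Set[str] = set()
        for parent in parents_of(nodes):
            sibs.update(children[parent])
        return sibs - nodes

    d = parents_of(focal)                       # D = Parents of D
    to_aggregate = sorted(d)                    # List A = D
    marked = ancestors_of(d)                    # Mark the ancestors of D
    while d:
        s = siblings_of(d)
        to_aggregate.extend(sorted(s - set(to_aggregate)))   # Add S to A
        marked |= ancestors_of(s)               # Mark the ancestors of S
        d = parents_of(d)                       # move one level up
    # Aggregate only unmarked nodes that actually have something to aggregate.
    return [n for n in to_aggregate if n not in marked and children.get(n)]


hierarchy = {"Root": ["G1", "G2"], "G1": ["P", "U"], "P": ["Q", "R", "S"],
             "U": ["V"], "G2": ["W"]}
plan = plan_focal_aggregation(hierarchy, {"S"})
```

With S as the focal node, P is aggregated (covering Q and R while S stays visible), and the unrelated branches U and G2 are collapsed; G1 and Root are ancestors of aggregated nodes and are left alone.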

FIG. 11 shows a flowchart showing the method steps of a preferred process executed by the present invention. By executing the process specified in FIG. 11 615 nodes associated with text are assigned a short label that aids in providing an overall view of the k-partite graph. The challenge is finding short titles that provide useful summaries. The basic idea is to take advantage of titles by extracting two terms from the title while using normalized weights derived from the text associated with all of the nodes. A term is either an individual word or a consecutive word pair. The choice of which terms to extract from the title is based on an analysis of all the words associated with a node (e.g., the full text of a document) in concert with an analysis of all the words associated with all of the nodes in the graph. The process begins with step 1110 where the title and full text are obtained for each node. In step 1120 the title and full text are broken up into terms (i.e., single words and consecutive word pairs) and the occurrences of these terms are counted. The occurrence of terms is counted per document and a count is kept of how many of the documents associated with nodes contained the term at least once. In step 1130 normalized weights are computed for each term. The computation of the weights uses the following prior art formula:

$\frac{tf(i)\,\log\left(N/n(i)\right)}{\sqrt{\sum_{j}\left(tf(j)\,\log\left(N/n(j)\right)\right)^{2}}}$

where
    tf(i) is the term frequency of term i in the current document,
    N is the number of documents,
    n(i) is the number of documents in which term i appeared at least once, and
    tf(j) is the term frequency of term j in the current document, ranging over all j.

Finally, in step 1140 a label is selected by rank ordering the terms by weight and taking the top N, where N here denotes the desired label size rather than the number of documents. The preferred embodiment uses N=2. The resulting printable label is displayed by the Graphical User Interface 200 within or next to the node 220 when rendering the k-partite graph 125.
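The labeling steps 1110 through 1140 can be sketched in Python. The sketch makes several assumptions that the text leaves open: terms are produced by lower-casing and splitting on whitespace, each text passed in already includes the title, ties in weight are broken alphabetically, and the function names `terms` and `label` are hypothetical.

```python
import math
from collections import Counter
from typing import List


def terms(text: str) -> List[str]:
    """Single words plus consecutive word pairs (step 1120)."""
    words = text.lower().split()
    return words + [" ".join(pair) for pair in zip(words, words[1:])]


def label(title: str, texts: List[str], index: int, size: int = 2) -> List[str]:
    counts = [Counter(terms(t)) for t in texts]           # tf per document
    n_docs = len(counts)
    doc_freq = Counter(t for c in counts for t in c)      # n(i)
    tf = counts[index]
    raw = {t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf}
    norm = math.sqrt(sum(w * w for w in raw.values())) or 1.0
    weight = {t: w / norm for t, w in raw.items()}        # step 1130
    # Step 1140: rank the title's terms by weight and keep the top `size`.
    ranked = sorted(set(terms(title)), key=lambda t: (-weight.get(t, 0.0), t))
    return ranked[:size]


texts = ["graph aggregation methods graph aggregation",
         "search methods for documents",
         "ranking methods for search"]
short_label = label("graph aggregation methods", texts, index=0)
```

A term such as "methods" that occurs in every document receives a weight of zero because log(N/n(i)) is zero, so it is not chosen for the label while more distinctive title terms are available.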

1. A computer system for information processing comprising: a collection of a plurality of objects, each object defining a node in a k-partite graph, such that, the nodes can be divided into a number of mutually exclusive sets such that each of the nodes are in exactly one of the sets, the k-partite graph further comprising one or more edges occurring only between nodes in different sets, each of the sets being a mode; and a simplification process that aggregates one or more of the nodes of one of the sets into one or more category nodes, where the category node becomes a member of the respective set and the category node inherits the edges of all the aggregated nodes in the respective set.
2. A computer system, as in claim 1, where the category node is newly created.
3. A system, as in claim 1, where the nodes in the respective set that have been aggregated are removed while one or more of the category nodes remain.
4. A system, as in claim 1, where the simplification process reduces a k-partite graph with N distinct modes to a k-partite graph with N minus one modes.
5. A computer system for information processing comprising: a collection of a plurality of objects, each object defining a node in a k-partite graph, such that, the nodes can be divided into a number of mutually exclusive sets such that each of the nodes are in exactly one of the sets, the k-partite graph further comprising one or more edges occurring only between nodes in different sets, each of the sets being a mode; a simplification process that aggregates one or more of the nodes of one of the sets into one or more category nodes, where the category node becomes a member of the respective set and the category node inherits the edges of all the aggregated nodes in the respective set; and a directed acyclic graph index (DAGI) that contains one or more DAGI nodes connected to one another by one or more DAGI edges, one or more of the DAGI nodes corresponding to one or more of the categories, and one or more of the DAGI nodes corresponding to one or more of the nodes in the k-partite graph, and the DAGI edges establishing a hierarchy of containment of the DAGI nodes, and the simplification process selecting one or more non-leaf DAGI nodes as one of the category nodes and the selection process using one or more descendent nodes of the non-leaf node to identify the nodes aggregated in the respective category.
6. A computer system for information processing comprising: a collection of a plurality of objects, each object defining a node in a k-partite graph, such that, the nodes can be divided into a number of mutually exclusive sets such that each of the nodes are in exactly one of the sets, the k-partite graph further comprising one or more edges occurring only between nodes in different sets, each of the sets being a mode; a simplification process that aggregates one or more of the nodes of one of the sets into one or more category nodes, where the category node becomes a member of the respective set and the category node inherits the edges of all the aggregated nodes in the respective set; and a directed acyclic graph index (DAGI) that contains one or more DAGI nodes connected to one another by one or more DAGI edges, one or more of the DAGI nodes corresponding to one or more of the categories, and one or more of the DAGI nodes corresponding to one or more of the nodes in the k-partite graph, and the DAGI edges establishing a hierarchy of containment of the DAGI nodes, and the simplification process selecting one or more non-leaf DAGI nodes as one of the category nodes and the selection process using one or more descendent nodes of the non-leaf node to identify the nodes aggregated in the respective category, where the simplification process selects one or more of the DAGI nodes, causing one or more of the k-partite graph nodes to be selected, insofar as the DAGI nodes are in 1-1 correspondence with the k-partite graph nodes, and then causing the selected k-partite nodes to be hidden so that they are deleted for future processing.
7. A system, as in claim 6, where objects are any one or more of the following: a document, an e-mail message, a patent, a digital media asset, a video, an audio clip, a UseNet article, a web page, a person, an organization, a company, and any object to which multiple attributes can be assigned.
8. A system, as in claim 6, where the set includes any one or more of the following: a document author, an organization or company, a document title, a document subject, a structural property such as nodes with degree two, nodes comprising connected components, and any category that is useful in describing an aggregate set of objects.
9. A system, as in claim 6, where the one or more of the non-leaf DAGI nodes is a newly created non-leaf DAGI node.
10. A system, as in claim 6, where the DAGI is created from analyzing the structure of the k-partite graph.
11. A system, as in claim 10, where the structure is analyzed by one of the following: the degree of the nodes, a reduction of the k-partite graph to a tree by a depth-first traversal, a reduction of the k-partite graph to a tree by a breadth-first traversal, a reduction of the k-partite graph to a directed acyclic graph index (DAGI), a reduction of the k-partite graph to one or more connected components, a reduction of the k-partite graph to one or more strongly connected components, a reduction of the k-partite graph to a partial ordering where the nodes of each mode are placed into a partial order such that a node X is considered to be greater than another node Y, if the set of nodes to which it is connected is a superset of the nodes to which Y is connected, a reduction of a k-partite graph containing X modes, such that X>2, to a k-partite graph containing X minus one modes with the removed nodes and edges used to define the DAGI.
12. A computer system for information processing comprising: a collection of a plurality of objects, each object defining a node in a k-partite graph, such that, the nodes can be divided into a number of mutually exclusive sets such that each of the nodes are in exactly one of the sets, the k-partite graph further comprising one or more edges occurring only between nodes in different sets, each of the sets being a mode; a simplification process that aggregates one or more of the nodes of one of the sets into one or more category nodes, where the category node becomes a member of the respective set and the category node inherits the edges of all the aggregated nodes in the respective set; a directed acyclic graph index (DAGI) that contains one or more DAGI nodes connected to one another by one or more DAGI edges, one or more of the DAGI nodes corresponding to one or more of the categories, and one or more of the DAGI nodes corresponding to one or more of the nodes in the k-partite graph, and the DAGI edges establishing a hierarchy of containment of the DAGI nodes, and the simplification process selecting one or more non-leaf DAGI nodes as one of the category nodes and the selection process using one or more descendent nodes of the non-leaf node to identify the nodes aggregated in the respective category, where the simplification process selects one or more of the DAGI nodes, causing one or more of the k-partite graph nodes to be selected, insofar as they are in 1-1 correspondence, and then causing the selected k-partite nodes to be hidden so that they are effectively deleted for future processing; and one or more displays for displaying the k-partite graph.
13. A system, as in claim 12, where the display further displays the DAGI.
14. A system, as in claim 12, where the simplification process does not display the nodes in the category but does display the category nodes.
15. A system, as in claim 12, further comprising a layout process for determining the placement of the nodes and the edges.
16. A system, as in claim 15, where the DAGI is rendered as a tree control.
17. A system, as in claim 15, where the identification of the category node is determined by selection of one or more of the DAGI nodes using the tree control.
18. A system, as in claim 1, further comprising a plurality of text descriptions, each text description associated with one object and each object optionally having a text description.
19. A system, as in claim 1, further comprising a labeling process for determining an optional succinct text label for each node in the k-partite graph.
20. A computer process for information processing comprising the steps of: collecting of a plurality of objects, each object defining a node in a k-partite graph, such that, the nodes can be divided into a number of mutually exclusive sets such that each of the nodes are in exactly one of the sets, the k-partite graph further comprising one or more edges occurring only between nodes in different sets, each of the sets being a mode; aggregating one or more of the nodes of one of the sets into one or more category nodes, where the category node becomes a member of the respective set and the category node inherits the edges of all the aggregated nodes in the respective set; and creating a directed acyclic graph index (DAGI) that contains one or more DAGI nodes connected to one another by one or more DAGI edges, one or more of the DAGI nodes corresponding to one or more of the categories, and one or more of the DAGI nodes corresponding to one or more of the nodes in the k-partite graph, and the DAGI edges establishing a hierarchy of containment of the DAGI nodes, and the simplification process selecting one or more non-leaf DAGI nodes as one of the category nodes and the selection process using one or more descendent nodes of the non-leaf node to identify the nodes aggregated in the respective category.
21. A computer system for information processing comprising: means for collecting of a plurality of objects, each object defining a node in a k-partite graph, such that, the nodes can be divided into a number of mutually exclusive sets such that each of the nodes are in exactly one of the sets, the k-partite graph further comprising one or more edges occurring only between nodes in different sets, each of the sets being a mode; means for aggregating one or more of the nodes of one of the sets into one or more category nodes, where the category node becomes a member of the respective set and the category node inherits the edges of all the aggregated nodes in the respective set; and means for creating a directed acyclic graph index (DAGI) that contains one or more DAGI nodes connected to one another by one or more DAGI edges, one or more of the DAGI nodes corresponding to one or more of the categories, and one or more of the DAGI nodes corresponding to one or more of the nodes in the k-partite graph, and the DAGI edges establishing a hierarchy of containment of the DAGI nodes, and the simplification process selecting one or more non-leaf DAGI nodes as one of the category nodes and the selection process using one or more descendent nodes of the non-leaf node to identify the nodes aggregated in the respective category.
22. A computer program memory product having a process for information processing comprising the steps of: collecting of a plurality of objects, each object defining a node in a k-partite graph, such that, the nodes can be divided into a number of mutually exclusive sets such that each of the nodes are in exactly one of the sets, the k-partite graph further comprising one or more edges occurring only between nodes in different sets, each of the sets being a mode; aggregating one or more of the nodes of one of the sets into one or more category nodes, where the category node becomes a member of the respective set and the category node inherits the edges of all the aggregated nodes in the respective set; and creating a directed acyclic graph index (DAGI) that contains one or more DAGI nodes connected to one another by one or more DAGI edges, one or more of the DAGI nodes corresponding to one or more of the categories, and one or more of the DAGI nodes corresponding to one or more of the nodes in the k-partite graph, and the DAGI edges establishing a hierarchy of containment of the DAGI nodes, and the simplification process selecting one or more non-leaf DAGI nodes as one of the category nodes and the selection process using one or more descendent nodes of the non-leaf node to identify the nodes aggregated in the respective category.